Abstract

Consider a haploid population which has evolved through an exchangeable reproduction dynamics,
and in which all individuals alive at time $t$ have a most recent common ancestor
(MRCA) who lived at time $A_t$, say. As time goes on, not only the population but also its
genealogy evolves: some families will get lost from the population and eventually a new MRCA
will be established. For a time-stationary situation and in the limit of infinite population
size $N$ with time measured in units of $N$ generations, i.e. in the scaling of population genetics
which leads to Fisher-Wright diffusions and Kingman's coalescent, we study the process
$\mathcal{A} = (A_t)$ whose jumps form the point process of time pairs $(E, B)$ when new MRCAs are
established and when they lived. By representing these pairs as the entrance and exit time of particles
whose trajectories are embedded in the look-down graph of Donnelly and Kurtz (1999) we can
show by exchangeability arguments that the times $E$ as well as the times $B$ form a Poisson
process. Furthermore, the particle representation helps to compute various features of the
MRCA process, such as the distribution of the coalescent at the instant when a new MRCA is
established, and the distribution of the number of MRCAs to come that live in today's past.

1 Introduction 

The genealogy back to the most recent common ancestor (MRCA) of those currently alive, and
especially the time back to the MRCA, has been an ongoing object of interest in mathematical
population genetics, see [Lit75], [Gri80] for early references and [Wak05] for a recent monograph.
The limit of effective population size $N \to \infty$, with time measured in units of $N$ generations, is the
scaling in which Kingman's coalescent appears ([Kin82]): in the rescaled time measured backward
from a fixed time the number of ancestral lineages enters from infinity and jumps from $k$ to $k-1$
at rate $\binom{k}{2}$. (Here and below we assume that the population size remains constant in time.) The
depth $D_t$ of the coalescent tree, that is the rescaled time it takes the number of ancestral lineages
to decrease from $\infty$ to 1, is then a sum of independent exponentially distributed random variables
with means $\binom{k}{2}^{-1}$, $k = 2, 3, \ldots$, and consequently has expectation 2.
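The expectation 2 can be checked by summing the mean holding times $\binom{k}{2}^{-1} = 2/(k(k-1))$. The following is a minimal sketch for a coalescent started from finitely many lineages; the function name `expected_depth` is introduced here purely for illustration.

```python
from fractions import Fraction


def expected_depth(n):
    """Expected time for Kingman's coalescent to go from n lineages down to 1.

    While k lineages are present, the next coalescence happens at rate
    C(k, 2) = k(k-1)/2, so the mean holding time is 2 / (k(k-1)).
    """
    return sum(Fraction(2, k * (k - 1)) for k in range(2, n + 1))


# The sum telescopes to 2 * (1 - 1/n), which increases to 2 as n grows.
for n in (2, 10, 100, 1000):
    print(n, float(expected_depth(n)))
```

The exact rational arithmetic makes the telescoping visible: the expected depth for $n$ lineages is $2(1 - 1/n)$, so almost all of the depth is already accumulated by a moderate number of lineages.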

With the population evolving further, also its genealogical relationships given by the coalescent
tree change. In this study we are interested in the time evolution of one particular characteristic
of the genealogy, that is, the time $A_t := t - D_t$ when the MRCA of the population at time $t$ lived.
We will refer to $\mathcal{A} = (A_t)_{t \in \mathbb{R}}$ as the MRCA process.

At any time $t$ the total population consists of two oldest families, which stem from the two
oldest lines of descent dating back to the MRCA who lived at time $A_t$. These two families will
coexist for a while after time point $t$, and during this time interval the path of the MRCA process
$\mathcal{A}$ stays constant. At some random time $E_t > t$, one of the two families will go extinct and the
other one will fixate in the population. The MRCA of this surviving family must be more recent
than $A_t$, which amounts to a jump of the MRCA process at time $E_t$. Consequently, at time $E_t$,
the next MRCA is established, and the time when this next MRCA lives is $B_t := A_{E_t}$. In other
words, the path of the process $\mathcal{A}$ is constant as long as the two currently oldest families coexist in
the population, and jumps from $A_t$ to $B_t$ at time $E_t$ when one of the two families fixates.

The MRCA process is embedded in the genealogy of the population which is assumed to evolve
in a time stationary way. For a finite population consisting of $N$ individuals, a way to construct
the genealogy comes with the graphical representation of the Moran model: for each ordered
pair of indices $i \neq j \in \{1, \ldots, N\}$, an exponential clock rings at rate 1/2, and whenever this
happens, the individual with index $j$ dies and is replaced by an offspring of the individual with
index $i$. This results in a partitioning of $\mathbb{R} \times \{1, \ldots, N\}$ into coalescing ancestral lineages, from
which one can read off a version of the MRCA process for $N$ individuals. This process clearly
inherits time stationarity from the evolution of the population.

The Moran dynamics is exchangeable with respect to the individuals' indices. In contrast,
the look-down process introduced by Donnelly and Kurtz (1999), which is the basic tool in our
study and will be reviewed in Section 2, arranges the individuals' indices (henceforth referred to
as levels) at any time according to the persistence of the individuals' offspring in the population:
the offspring of an individual at level $i$ outlives the offspring of any contemporary individual at
some higher level. For a finite population number $N$ this is achieved as follows: Each level $j$
"looks down" to each smaller level $i$ at rate 1. Whenever this happens, all individuals at levels
$j, \ldots, N-1$ are pushed one level up, the individual at level $N$ is killed, and the individual at
level $i$ spawns a child at level $j$. The time stationary MRCA process read off from the look-down
graph obviously has the same distribution as the time stationary MRCA process read off from the
Moran graph.

The look-down process allows a passage to the limit of infinite population size in which the
ordering by persistence is preserved. The construction of the random look-down graph on $\mathbb{R} \times \mathbb{N}$
proceeds in the very same way as described above, except that there is no killing of individuals at
any finite level. Instead, the offspring of an individual at level $i \ge 2$ goes to extinction as soon as
the line of ascent of the individual is pushed to infinity. All this will be explained in more detail
in Section 2.

Because of the ordering by persistence, each MRCA of the population lives at level 1 at some
time $B$ at which it gives birth to an individual at level 2. As soon as the offspring of these two
individuals fixates in the population, the MRCA is established. Again, because of the ordering by
persistence, this happens at the time $E$ when the line of ascent which was pushed at time $B$ from
level 2 to level 3 reaches infinity. The process $\mathcal{F} := \{(E,B)\}$, which consists of all pairs of time
points when an MRCA is established in the population and when it lived, is a time-stationary
point process; we call it the MRCA point process. The paths of $\mathcal{A}$ and the point configurations of
$\mathcal{F}$ are in an obvious one-to-one correspondence.

The step from time $t$ to the next MRCA, which is established at time $E_t$ and lives at time $B_t$,
and an illustration of the MRCA point process are depicted in Figures 1(a) and 1(b), respectively.
In both Figures, the left axis contains the times when MRCAs live, and the right axis gives the
times when MRCAs are established. The joint distribution of $E_t$ and $B_t$ will be given in Theorem 1
in Section 3. Figure 1(b) displays part of the MRCA point process $\mathcal{F}$. Remarkably, not only
the points $B$ but also the points $E$ form a time stationary Poisson process, see Theorem 2 in
Section 4. This will be proved by representing the times $(B, E), (B', E'), \ldots$ as the entrance and
exit times of particles: the trajectory of a particle is attached to the line of ascent which is pushed
from level 2 to level 3 at time $B$ and exits at time $E$. We will specify the Markovian dynamics
of this particle system, compute its equilibrium distribution and show that, whenever a particle
exits at some time $E$, at this very instant the system of (remaining) particles is in equilibrium.
This allows us to conclude that the waiting time to the next exit time is exponential. The processes
$\mathcal{A}$ and $\mathcal{F}$, however, are not Markov, see Remark 4.1.3.

In Theorem 3 we compute the distribution of the random number
$Z_t := \#\{(E,B) \in \mathcal{F} \mid E > t, B < t\}$ of MRCAs that are established after time $t$ and live
before time $t$. In particular, it turns out that the probability that the next MRCA lives in today's
future is $P[Z_t = 0] = 1/3$.

Figure 1: (a) At time $A_t$ the MRCA of the population at time $t$ lived; $E_t$ and $B_t$ are the times when the next
MRCA is established and when it lived. (b) MRCAs occur in a time-stationary manner. The dots on the $B$-axis
are time points at which MRCAs lived. The dots on the $E$-axis are time points at which the MRCA changes.



As noted by [Taj90], the amount of polymorphism in a population is related to the fixation
of alleles. When an allele fixates, the MRCA of the population must have changed. At such a
fixation time, the full coalescent is unusually short. As neutral mutations fall independently on the
branches of the genealogical tree, this means that the amount of polymorphism is low at fixation
times.

We start out by reviewing the look-down process in Section 2, describe our results in Sections 3
and 4, point out some relations to population genetics in Section 5 and give the proofs of Theorems
1-3 in the remaining sections.



2 The MRCA process: a look-down construction 

At every time a continuum population which follows a Wright-Fisher (or Fleming-Viot) dynamics
has a genealogy given by Kingman's coalescent. The look-down process introduced by Donnelly
and Kurtz ([DK99]) not only gives a countable representation of evolving allele frequencies but at
the same time stores genealogical relationships of all the individuals alive in the population at all
times. Consequently the MRCA process can be read off from the look-down process.

The look-down graph: ancestral lineages, lines of ascent and ordering by persistence

We first give a brief review of the "modified look-down process" ([DK99]); see Figure 2 for a
graphical illustration.

Consider the set of vertices

$\mathcal{V} := \mathbb{R} \times \mathbb{N}$.

We will refer to the vertex $(t, i)$ as the individual at time $t$ at level $i$. For each ordered pair of levels
$i < j$, let $\mathcal{P}_{ij}$ be the support of a (rate one) Poisson point process on $\mathbb{R}$, all these processes being
independent. (In the terminology of Donnelly and Kurtz, at each time $t \in \mathcal{P}_{ij}$, the level $j$ looks
down to level $i$.) Based on the processes $\mathcal{P}_{ij}$ we will construct a random countable partition $\mathcal{G}$ of
$\mathcal{V}$, whose partition elements will be called lines. The partition $\mathcal{G}$ will always contain the so-called
immortal line $\iota$ defined by

$\iota := \mathbb{R} \times \{1\}$. (2.1)

For each $j > 1$, any point $s_0 \in \bigcup_{i<j} \mathcal{P}_{ij}$ initiates a line $G \in \mathcal{G}$ of the form

$G = ([s_0, s_1) \times \{j\}) \cup ([s_1, s_2) \times \{j+1\}) \cup ([s_2, s_3) \times \{j+2\}) \cup \ldots$ (2.2)

with $s_{k+1} > s_k$ for all $k$. For a line $G$ as in (2.2) with $s_0 \in \mathcal{P}_{ij}$ we say that $G$ is born at level $j$ by
the individual $(s_0, i)$. We further say that $G$ is pushed (one level up) at times $s_1, s_2, \ldots$ and exits
at time $s_\infty(G) := \lim_{n\to\infty} s_n$. The times $s_k$ are given for $k = 1, 2, \ldots$ by
$s_k = \inf\{s > s_{k-1} : s \in \bigcup_{1 \le \ell < m \le j+k-1} \mathcal{P}_{\ell m}\}$. Thus, a new line is born at level $j$ at each time $t$ when level $j$ looks down
to some level $i < j$. Simultaneously, all the lines having occupied at time $t-$ the levels $j, j+1, \ldots$
are pushed one level up. Note that, since the pushing rate increases quadratically in $j$, the exit
time $s_\infty(G)$ is finite a.s.
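To see why the exit time is finite, note that a line sitting at level $n$ is pushed by any look-down event among the first $n$ levels, that is at total rate $\binom{n}{2}$. The sketch below sums the corresponding mean waiting times exactly; `expected_time_to_level` is a hypothetical helper name used only for this illustration.

```python
from fractions import Fraction


def expected_time_to_level(j, m):
    """Expected time for a line born at level j to be pushed up to level m > j.

    At level n the line is pushed at rate C(n, 2) = n(n-1)/2 (one Poisson
    clock of rate 1 per ordered pair of levels among the first n levels),
    so the mean waiting time at level n is 2 / (n(n-1)).
    """
    return sum(Fraction(2, n * (n - 1)) for n in range(j, m))


# The sum telescopes to 2/(j-1) - 2/(m-1); letting m grow gives the finite
# limit 2/(j-1) for the expected exit time of a line born at level j.
for j in (2, 3, 5):
    print(j, float(expected_time_to_level(j, 10_000)))
```

Since the expected exit time $2/(j-1)$ is finite, the exit time itself is finite almost surely, in line with the remark above.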

For each $v \in \mathcal{V}$, we denote by $G_v$ the (unique) element of $\mathcal{G}$ that contains $v$. The forward level
process $Y_s^t(i)$, $t \ge s$, initiated by the individual $u = (s, i)$ is given by

$Y_s^t(i) :=$ level of $G_u$ at time $t$.

The line of ascent of individual $u = (s, i)$ is the part of line $G_u$ after time $s$, that is

$(t, Y_s^t(i))_{s \le t < s_\infty(G_u)}$.

We say that a line $H$ descends from a line $G$ if either $H = G$, or there is a finite sequence
of lines $G_1, \ldots, G_{n-1} \in \mathcal{G}$ such that $G_k$ is born by an individual in $G_{k-1}$, $k = 1, \ldots, n$, where
$G_0 := G$ and $G_n := H$.

The backward level process $X_s^t(j)$, $s \le t$, of an individual $v = (t, j)$ arises by tracing back the
level of $G_v$ to the birth time of $G_v$, then jumping to the level of the individual $u$ from which $G_v$
was born and tracing back the level of $G_u$ to the birth time of $G_u$, and so on.

The ancestral lineage of the individual $v = (t, j)$ is

$(s, X_s^t(j))_{-\infty < s \le t}$;

note that eventually all ancestral lineages coalesce with the immortal line.

We say that an individual $v \in \mathcal{V}$ descends from an individual $u \in \mathcal{V}$ (or equivalently, $u$ is an
ancestor of $v$) if $u$ belongs to the ancestral lineage of $v$. The random tree spanning $\mathcal{V}$ which is
obtained in this way is the random look-down graph.

Let $u = (s, i)$ and $v = (s, j)$ be two individuals living at the same time $s$, with $i < j$. By
construction the line of ascent of $v$ is pushed whenever the line of ascent of $u$ is pushed, hence
$Y_s^t(i) < Y_s^t(j)$, and the line of ascent of $v$ exits not later than that of $u$. In this sense, the ordering
of lines by contemporaneous levels is an ordering by persistence. Note also that for all times $s \le t$
and all levels $i \in \mathbb{N}$:

$Y_s^t(i) = \inf\{j \in \mathbb{N} : X_s^t(j) = i\}$.

Thus, the time when an individual's line of ascent reaches infinity marks the time at which the
individual's offspring goes extinct.

Let us note in passing that the ordering by persistence is a main distinction between the version
of the look-down process developed in [DK99] and its precursor introduced in [DK96]. In the latter,
the order by persistence is only stochastic, that is, lines of ascent of contemporaneous individuals
at lower levels are longer "in probability". In the modified look-down process of [DK99], explained
and employed in the present paper, this property holds almost surely.

Coalescent curves and fixation curves 

For $t \in \mathbb{R}$ and $i \in \mathbb{N}$ the coalescent tree $\mathcal{C}^t(i)$ consists of the ancestral lineages of the individuals
$(t, 1), \ldots, (t, i)$, i.e. $X_s^t(1), \ldots, X_s^t(i)$ for $s \le t$, whereas the full coalescent tree $\mathcal{C}^t$ is made up of the
ancestral lineages of all individuals living at time $t$. All these lineages eventually coalesce with the
immortal line. Since any pair of ancestral lineages coalesces at rate 1, $\mathcal{C}^t(i)$ and $\mathcal{C}^t$ are distributed
like Kingman's (finite respectively infinite) coalescent. The number of lineages remaining at time
$s \le t$ can be expressed as

$C_s^t(i) := \max\{X_s^t(1), \ldots, X_s^t(i)\}$, $C_s^t := C_s^t(\infty) := \sup_{i \in \mathbb{N}} C_s^t(i)$. (2.3)

Figure 2: Detail of a look-down graph. Time is running upwards; all lines at the first 8 levels are drawn between
times $s$ and $t$. At times in $\mathcal{P}_{ij}$ an arrow is drawn from $i$ to $j$. All lines at levels at and above $j$ are pushed
upwards as indicated by bent lines. The solid marked line is the fixation curve $F_B^\tau$, $\tau \ge B$. The dotted line is the
coalescent curve $C_r^t(8)$, $r \le t$. The dashed line is the line born at level 4 by the individual $(s_0, 1)$; it is pushed
one level up at times $s_1, s_2, \ldots$. In this picture, $X_s^t(1) = \ldots = X_s^t(5) = X_s^t(7) = 1$ and $X_s^t(6) = X_s^t(8) = 2$;
$Y_s^t(1) = 1$, $Y_s^t(2) = 6$, $Y_s^t(3) > 8$; $C_s^t(1) = \ldots = C_s^t(5) = 1$ and $C_s^t(6) = \ldots = C_s^t(8) = 2$.

In words, $C_s^t(i)$ is the number of time-$s$ ancestors of the time-$t$ individuals at levels $1, \ldots, i$, and $C_s^t$
is the number of time-$s$ ancestors of the whole population at time $t$. For fixed $t$, we call $(C_s^t)_{s \le t}$
the coalescent curve in the look-down graph back from time $t$. It is distributed like the death
process in Kingman's coalescent entering from infinity.

The time when the MRCA of the total population at time $t$ lived is

$A_t := \sup\{s : C_s^t = 1\}$.

All individuals at time $t$ descend either from individual $(A_t, 1)$ or from individual $(A_t, 2)$. At time
$A_t$ a line must be born at level 2, which is equivalent to $A_t \in \mathcal{P}_{12}$. Denote the next point in $\mathcal{P}_{12}$
after $A_t$ by $B_t$:

$B_t := \min\{s \in \mathcal{P}_{12} : s > A_t\}$.

The offspring of the two individuals $(B_t, 1), (B_t, 2)$ evolves towards fixation in the population by
pushing the line of ascent of the individual $(B_t, 3)$ towards infinity. The time $E_t$ when this line of
ascent exits equals the time when the offspring of the individual $(A_t, 2)$ is expelled by the offspring
of $\{(B_t, 1), (B_t, 2)\}$. Thus the time $E_t$ is the first time after $t$ when a new MRCA is established,
and the time when this MRCA lives is $B_t$.

Note that at any time $\tau$ between $B_t$ and $E_t$, all the levels $1, \ldots, Y_{B_t}^\tau(3) - 1$ are occupied by
offspring of $\{(B_t, 1), (B_t, 2)\}$, whereas level $Y_{B_t}^\tau(3)$ is not. We therefore call

$F_{B_t}^\tau := Y_{B_t}^\tau(3) - 1 = Y_{A_t}^\tau(2) - 1$, $B_t \le \tau < E_t$, (2.4)

the fixation curve starting in time $B_t$. (For the equality in (2.4), note that the line containing
$(B_t, 3)$ was born at time $A_t$ at level 2 and was pushed to level 3 at time $B_t$.) When $Y_{B_t}^\tau(3) = k$
the corresponding line moves to $k+1$ at the next look-down event among the first $k$ levels, i.e. at
rate $\binom{k}{2}$. As a consequence, $F_{B_t}$ is pushed from level $k$ to level $k+1$ at rate $\binom{k+1}{2}$.

The MRCA point process $\mathcal{F}$ records all the time points when the fixation curves start and end.
We will pursue this in Section 4 by constructing an autonomous particle system whose trajectories
give the fixation curves.

Whereas the coalescent curves are constructed from any $t$ backwards in time, the fixation
curves start only at points in $\mathcal{P}_{12}$ and are constructed forwards in time.
At a time $E$ when a fixation curve ends (and a new MRCA is established), all individuals
descend from the MRCA who lived at the time B when this fixation curve started. Hence the 
fixation curve between time points B and E equals the coalescent curve back from time E. With 
time proceeding, the coalescent curve evolves, being more and more "zipped away" from the upper 
end of the fixation curve (near time point E), and still sharing the lower part (near time point B) 
for a while. 

Having now constructed the process A in terms of the look-down graph, we will study its 
properties in the next sections. 



3 From today to the next MRCA 

As in the previous section, $A_t$ denotes the time when the current MRCA lived, $E_t$ is the time
when the next MRCA is established and $B_t = A_{E_t}$ is the time when the next MRCA lives. In this
section we will compute the conditional distribution of $(E_t, B_t)$ given $A_t$.
The following random variables will play a crucial role:

$L_t := F_{B_t}^t$, (3.1)

the level at time $t$ of the fixation curve starting at time $B_t$, and

$I_t := C_{B_t}^t$, (3.2)

the level at time $B_t$ of the coalescent curve back from time $t$, where we define

$L_t := 1$ and $I_t := \infty$ on the event $\{B_t > t\}$.

Without loss of generality, and to ease notation, let us put $t = 0$, and write $L := L_0$, $I := I_0$,
$E := E_0$, $B := B_0$.
Note that, because of the ordering by persistence, the lines of ascent starting at time 0 from
levels $1, \ldots, L$ exit only after time $E$, whereas the lines starting at time 0 from levels $L+1, L+2, \ldots$
exit at time $E$ or earlier. Thus, $L$ is the random number of individuals in the present population
that still have offspring when the next MRCA is established.

Proposition 3.1. The pair $(L, I)$ is independent of $A_0$ and has distribution

( e-i 



e>2,i>3 



0, else. 



(3.3) 



Proposition 3.1 will be proved in a later section.
Remark 3.2. 1. Summing over $i$ in (3.3) leads to the distribution of $L$:

$P[L = \ell] = \frac{2}{(\ell+1)(\ell+2)}$, $\ell = 1, 2, \ldots$ (3.4)

Since $\{L = 1\} = \{B > 0\}$ is the event that the first fixation curve which ends after time 0
has not yet started by time $t = 0$, we infer that the probability that the next MRCA lives
in today's future is

$P[B > 0] = P[L = 1] = 1/3$.

2. Here is another quick way to (3.4), exploiting exchangeability. Recall that the number $L$
gives the number of lines that still have offspring at the time when the next MRCA is
established. At any time there are two oldest families in the population. The family sizes
of these two oldest families, denoted by $P$ and $1 - P$, evolve according to a Wright-Fisher
diffusion. It is well known (and can be understood from the Pólya urn scheme embedded in
the genealogy; see e.g. facts about the Pólya-Eggenberger distribution in [JK77], eq. (4.1))
that, at any fixed time, say at time $t = 0$, $P$ is uniformly distributed on $[0,1]$. This also
remains true conditioned on the event $\{A_0 = -d\}$. By exchangeability, the probability that
the first $\ell$ most persistent lines are in one and the $(\ell+1)$-st most persistent line is in the
other family is

$\int_0^1 \big(p^\ell (1-p) + (1-p)^\ell p\big)\, dp = \frac{2}{(\ell+1)(\ell+2)}$. (3.5)
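The exchangeability argument can be checked with exact arithmetic: for a uniform split $P$, the probability that the first $\ell$ lines fall into one family and the next line into the other is $2\int_0^1 p^\ell(1-p)\,dp$. The sketch below compares this with the weights $2/((\ell+1)(\ell+2))$ of (3.4); the function names are chosen here for illustration only.

```python
from fractions import Fraction


def weight(l):
    """Weight P[L = l] = 2 / ((l+1)(l+2)) as in (3.4)."""
    return Fraction(2, (l + 1) * (l + 2))


def split_probability(l):
    """Probability that lines 1..l are in one family and line l+1 in the other,
    when the family split P is uniform on [0, 1].

    Uses the integral of p^l (1-p) over [0, 1], which is 1/(l+1) - 1/(l+2);
    the factor 2 accounts for either family holding the first l lines.
    """
    return 2 * (Fraction(1, l + 1) - Fraction(1, l + 2))


# The two expressions agree, and the weights sum to 1 - 2/(n+2) over l = 1..n.
print(weight(1), split_probability(1))
print(float(sum(weight(l) for l in range(1, 1000))))
```

In particular `weight(1)` equals 1/3, the probability that the next MRCA lives in today's future.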

To prepare for Theorem 1 we need one more bit of notation.

Definition 3.3. Let $T_k$ be independent exponentially distributed random variables with parameter
$\binom{k}{2}$, $k = 2, 3, \ldots$, and

$S_i^j := \sum_{k=i+1}^{j} T_k$, $1 \le i < j \le \infty$.

For $d > 0$ and $i = 1, 2, \ldots$ let $R_{i,d}$ be a random variable whose distribution equals the conditional
distribution of $S_i^\infty$ given that $S_1^i + S_i^\infty = d$.

The random variable $S_i^j$ represents the time which Kingman's coalescent requires to come down
from $j$ to $i$ lines. Consequently, $R_{i,d}$ refers to the random time for a coalescent to come down from
infinity to $i$ lines, given that coming down to 1 line requires exactly time $d$. Note also that $S_i^j$
represents the time a fixation curve needs to be pushed from level $i$ to level $j$. This can be seen
because a fixation curve goes from level $\ell$ to level $\ell+1$ whenever a look-down event among the
first $\ell+1$ levels occurs, i.e. with rate $\binom{\ell+1}{2}$.

We are now prepared to state Theorem 1, which together with Proposition 3.1 yields the desired
conditional distribution of $(E, B)$ given $A_0$.

Theorem 1. Let $L$ and $I$ be as in (3.1) and (3.2). The conditional distribution of $(E, B)$, given
$A_0 = -d$, $L = \ell$ and $I = i$ is represented by the random variables

$(S_1^2 + S_2^\infty,\ S_1^2)$ if $\ell = 1$,
$(S_\ell^\infty,\ -R_{i,d})$ if $\ell > 1$, (3.6)

where $(S_i^j)$ and $R_{i,d}$ have the distribution specified in Definition 3.3 and $R_{i,d}$ and $S_\ell^\infty$ are
independent.

The proof of Theorem 1 is given in a later section.
Remark 3.4. 1. Combining (3.6) and (3.4) we obtain

$P[E \in ds \mid A_0 = -d] = \sum_{\ell=1}^{\infty} \frac{2}{(\ell+1)(\ell+2)}\, P[S_\ell^\infty \in ds]$, $s > 0$. (3.7)

From this one can conclude that the conditional distribution of $E$ given $A_0$ is standard
exponential. Indeed, think of a 2-sample (i.e. a subsample of size two) embedded in a
full coalescent. The coalescence time of this 2-sample is standard exponentially distributed.
Denoting by $L'$ the number of lineages remaining in the full coalescent at the time when
the 2-sample has found its common ancestor, one sees from [GT03], eq. (2.10) or [STW84],
Lemma 3, or by direct calculation, that $L'$ has the same distribution as $L$ specified in (3.4)
and (3.5). This shows that the r.h.s. of (3.7) is a decomposition of the standard exponential
distribution.
2. Here is another quick (though slightly informal) argument that the waiting time to the
next jump of the MRCA is exponential, independently of the depth of the current MRCA.
Note first that, conditioned on $A_0 = -d$, the split of the population into the two oldest
families at time $t = 0$ is uniformly distributed on $[0, 1]$. As a consequence, given the MRCA
does not jump during the time interval $[0, s]$, the split remains uniformly distributed also at
time $s$. (This corresponds to the fact that the uniform distribution is a quasi-equilibrium
for the Wright-Fisher diffusion.) At the next jump of the MRCA process one of the two
oldest families dies out. After the jump there will be two families inside the surviving family
that again make up a uniform split. This implies that the time between jumps proceeds in a
memoryless manner, showing that the conditional distribution of $E_0$ given $A_0$ is exponential.

Notably, the fact of exponential waiting times between the jumps can also be read from (3.10)
in [Wat82a]. See Section 5 for comments relating to this paper and to other applications.
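The decomposition of the standard exponential distribution in Remark 3.4 can be checked numerically through Laplace transforms. The sketch below assumes the mixture weights $2/((\ell+1)(\ell+2))$ and uses that $S_\ell^\infty$ is a sum of independent exponentials with rates $\binom{k}{2}$, $k > \ell$, so that $E[e^{-\lambda S_\ell^\infty}] = \prod_{k>\ell} \binom{k}{2} / (\binom{k}{2} + \lambda)$. The truncation level `K` and the function name are choices made for this illustration.

```python
def mixture_laplace(lam, K=200_000):
    """Laplace transform at lam of the mixture sum_l w_l * Law(S_l^infty),
    with w_l = 2 / ((l+1)(l+2)), truncated at level K.

    The products prod_{k=l+1}^{K} c_k / (c_k + lam), c_k = k(k-1)/2, are
    accumulated from the top so that the whole computation is O(K).
    """
    total = 0.0
    prod = 1.0
    for l in range(K - 1, 0, -1):
        k = l + 1
        c = k * (k - 1) / 2.0
        prod *= c / (c + lam)  # now prod covers k = l+1, ..., K
        total += 2.0 / ((l + 1) * (l + 2)) * prod
    return total


# For a standard exponential variable the Laplace transform is 1 / (1 + lam).
for lam in (0.25, 1.0, 3.0):
    print(lam, mixture_laplace(lam), 1.0 / (1.0 + lam))
```

The truncation discards mixture mass of order $2/K$, so agreement to about four or five decimal places is what one should expect here.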

4 A particle representation of the MRCA point process 

The set $\mathcal{G}$ of lines defined in Section 2 randomly partitions the set $\mathcal{V} = \mathbb{R} \times \mathbb{N}$. Let us write

$\mathcal{G}_2 := \{G \in \mathcal{G} \mid G \text{ is born at level } 2\}$.

For each line $G \in \mathcal{G}_2$ write $B := B(G)$ for the time when $G$ is pushed from level 2 to level
3 (due to the birth of the next line in $\mathcal{G}_2$) and $E := E(G)$ for the exit time of $G$. Thus we obtain
a one-to-one correspondence between $\mathcal{G}_2$ and the sequence of fixation curves by associating with
any $G \in \mathcal{G}_2$ the fixation curve $F_B$ starting at time $B$ and ending at time $E$. This fixation curve
is related to the level path of $G$ by $F_B^t = Y_B^t(3) - 1$ for $B \le t < E$; see (2.4). The MRCA point
process $\mathcal{F}$ then can be written as

$\mathcal{F} = \{(E, B) \mid G \in \mathcal{G}_2\}$.

Additionally, we write

$\eta := \{E \mid (E, B) \in \mathcal{F}\}$ and $\eta_t := \eta \cap (-\infty, t]$

for the exit time point process and its restriction to $(-\infty, t]$, respectively.

In this section we will gain more information about the processes $\mathcal{F}$ and $\eta$ by interpreting the
fixation curves as the trajectories of an interacting particle system on $\{2, 3, 4, \ldots\}$ whose dynamics
and equilibrium distribution we will compute.

Let

$Z_t := \#\{(E, B) \in \mathcal{F} \mid E > t, B < t\}$.

In other words, $Z_t$ is the number of fixation curves present at time $t$, that is, the number of
MRCAs which will be established after time $t$ and have lived before time $t$.
Write

$L_t^1 > L_t^2 > \ldots > L_t^{Z_t} > 1$ (4.1)

for the levels of the fixation curves at time $t$. Let us interpret $(L_t^1, L_t^2, \ldots, L_t^{Z_t})$ as a configuration
of particles on the set of levels $\{2, 3, 4, \ldots\}$ at time $t$, and put

$\Lambda_t := (L_t^1, L_t^2, \ldots)$, $\Lambda := (\Lambda_t)$,

where $L_t^j := 1$ for $j > Z_t$. The first components in the MRCA point process are the exit times
of the "leading particles", i.e. those time points $E$ where $\lim_{t \uparrow E} L_t^1 = \infty$. Whenever a particle
exits, the indices of all remaining particles are shifted down by one:

$(L_E^1, L_E^2, \ldots) := (L_{E-}^2, L_{E-}^3, \ldots)$. (4.2)

Figure 3: The embedding of the fixation curves in the look-down process. At times $B$ fixation
curves start and at times $E$ they end. In this example, at time $s$ the number of particles in the
system is $Z_s = 2$, the leading particle being at level 6, and the second particle at level 2.

Here is a verbal description of the dynamics of the particle system (see Proposition 7.1 for a formal
statement): Particles are pushed in at level 2 at rate 1, each particle at level $\ell \ge 2$ is pushed one
level up at rate $\binom{\ell+1}{2}$, and this is done in a coupled way such that, whenever a particle is pushed,
all particles at higher levels are pushed simultaneously. The next theorem specifies the equilibrium
distribution of $\Lambda$. We will see that this distribution prevails also in the distinguished random time
points $E$ where $\lim_{t \uparrow E} L_t^1 = \infty$. This property is crucial to see that $\eta$ is a Poisson process.
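The verbal description of the dynamics can be turned into a small simulation. The sketch below is only an illustration under stated assumptions: it drives the particles by look-down events (level $m$ looks down at rate $m-1$, which pushes every particle at level $m-1$ or higher, and an event with $m = 2$ also inserts a new particle at level 2), and it removes a particle once it passes an artificial cap instead of following it to infinity. The cap, the seed, the run length and the function name are all choices made here, not part of the construction above. The long-run fraction of time without particles should then be close to $P[Z_t = 0] = 1/3$.

```python
import math
import random


def simulate_fraction_empty(total_time=4000.0, cap=100, seed=1):
    """Estimate the fraction of time during which no fixation curve is present."""
    rng = random.Random(seed)
    particles = []      # particle levels, leading (highest) particle first
    t = 0.0
    empty_time = 0.0
    while t < total_time:
        top = particles[0] if particles else 1
        n = top + 1                  # only look-downs among levels 1..n matter
        rate = n * (n - 1) / 2.0     # total rate C(n, 2) of these events
        dt = rng.expovariate(rate)
        if not particles:
            empty_time += min(dt, total_time - t)
        t += dt
        # Pick the level m in {2, ..., n} that looks down, with P(m) ~ m - 1.
        u = rng.random() * rate
        m = min(int((1.0 + math.sqrt(1.0 + 8.0 * u)) / 2.0) + 1, n)
        # A look-down by level m pushes all particles at levels >= m - 1.
        particles = [lvl + 1 if lvl >= m - 1 else lvl for lvl in particles]
        if m == 2:
            particles.append(2)      # a new fixation curve starts at level 2
        # Approximation: treat a particle above the cap as having exited.
        particles = [lvl for lvl in particles if lvl <= cap]
    return empty_time / total_time


print(simulate_fraction_empty())
```

A particle at level $\ell$ is pushed by the events with $m \le \ell + 1$, whose rates add up to $\binom{\ell+1}{2}$, matching the description above. The cap shortens each particle's lifetime by roughly $2/\text{cap}$ on average, so the estimate carries a small upward bias in addition to Monte Carlo noise.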

Theorem 2. 1. The process $\Lambda = (\Lambda_t)$ is Markov with stationary distribution

2 




fe+ij 



«/4 > 1, 



7rA(4,^2, ...)=<( ^ rZli ^ ~ (4.3) 



In particular, the stationary distribution of $L_t^1$ is

$\pi_{L^1}(\ell) = \frac{2}{(\ell+1)(\ell+2)}$, $\ell = 1, 2, \ldots$ (4.4)

2. The process of exit times $\eta$ is a stationary Poisson process.

Remark 4.1. 1. The "arrival time points" $B$ of the particles in the system $\Lambda$ (the times when
the MRCAs live) are the points of the stationary Poisson process $\mathcal{P}_{12}$. Theorem 2 states
that also the "departure time points" $E$ (the times when the MRCAs are established) form
a stationary Poisson process. Thus, the Poisson input process of times $B$ when the fixation
curves begin is transformed by a "dependent stochastic shift" into the Poisson output process
of times $E$ when they end. This is similar to Burke's theorem which states that the departure
process in a time stationary M/M/1 queue is Poisson, see [Kur98] and references given there.
A crucial property (proved already by Burke (1956)) is that in a stationary M/M/1 queue
the distribution of the queue length at time $t$ is independent of the departure times $< t$.
In the language of queueing theory, our particles correspond to customers entering at the
time points of a stationary Poisson process, and the time which a typical customer spends
in the system is distributed like $S_2^\infty$ specified in Definition 3.3. These times are mutually
dependent. As the proof of Theorem 2 reveals, like in Burke's theorem the state of the
system (now given by the configuration $\Lambda_t$ of particles at time $t$) does not depend on the
departure times $< t$.

2. We recently learned from Tom Kurtz about the manuscript [DK06] where he and Peter
Donnelly have established the filtered martingale problem for the $N$-level analogue of the
pair $(\Lambda, \eta)$ in the context of [Kur98], Theorem 3.2, and thus achieved an alternative proof of
the fact that the "MRCA fixation process" $\eta$ is Poisson.

Figure 4: Assume we know $A_t = A_s = a$ for some $s = a + \varepsilon$. This knowledge leads to a higher chance of the
MRCA time $B_t$ falling between times $a$ and $s$ than in an equilibrium situation. This shows that the future of the
process $\mathcal{A}$ at time $t$ depends on the past and $\mathcal{A}$ cannot be Markov. See text for explanation.

3. Whereas the particle process $\Lambda$ is Markov, the MRCA process $\mathcal{A}$ is not. This can be seen as
follows:

Let $a, s, t$ be as in Figure 4. Conditioned on $A_t = a$ we obtain from Theorem 1:

$P[B_t < s \mid A_t = a] = P[R_{I_t,\, t-a} > t - s] \to 0$ as $s \downarrow a$.

On the other hand, we claim that $P[B_t < s \mid A_t = A_s = a]$ does not converge to zero as $s \downarrow a$,
which shows that $\mathcal{A}$ cannot be Markov. To verify the claim, we write, using Bayes' rule,

$P[B_t < s \mid A_t = A_s = a] = \frac{P[B_t < s,\ A_t = A_s \mid A_s = a]}{P[A_t = A_s \mid A_s = a]}$.

By Theorem 1 the denominator converges to $e^{-(t-a)}$ as $s \downarrow a$. Likewise, the numerator is
bounded away from 0 as $s \downarrow a$, a trivial lower bound being

$P[L_s = 2]\, P[S_2^3 > t - s] = \frac{1}{6} e^{-3(t-s)} \ge \frac{1}{6} e^{-3(t-a)}$.

4. The level $L_t^1$ of the leading particle in (4.1) coincides with $L_t$ defined in (3.1). Thus we
recover (3.4) from (4.4).

Recall from (4.1) that

$Z_t = \max\{j \in \mathbb{N} \mid L_t^j > 1\}$,

where $\max \emptyset := 0$. Consequently,

$\{Z_t = 0\} = \{L_t^1 = 1\}$.

This is the event that there is no particle on $\{2, 3, 4, \ldots\}$ at time $t$, or equivalently, that all fixation
curves starting before time $t$ also end before $t$. Given this event, the fixation curves starting before
time $t$ are independent of those starting after $t$.

In a way, the random number $Z_t$ of MRCAs that are established in today's future and live in
today's past measures the dependence between past and future in the MRCA process. Note also
that because of Theorem 3 the distribution of $Z_t$ does not change when $t$ is conditioned to be the
time of an MRCA change.

In the next theorem we calculate the equilibrium distribution of $Z := Z_t$.
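As a numerical companion to Theorem 3, the following sketch computes the first few probability weights of $Z$. It assumes that the generating function of Theorem 3 can be written in the product form $E[u^Z] = \frac{1}{3}\prod_{\ell \ge 2}\big(1 + \frac{2u}{(\ell-1)(\ell+2)}\big)$, which uses the identity $\ell(\ell+1) - 2 = (\ell-1)(\ell+2)$. The weights are then the coefficients of this product, truncated after a finite number of factors; the truncation level and the function name are choices made for this illustration.

```python
def z_weights(max_z=3, terms=200_000):
    """Approximate P[Z = z], z = 0..max_z, from the truncated product form
    (1/3) * prod_{l >= 2} (1 + b_l * u), with b_l = 2 / ((l - 1)(l + 2))."""
    coeffs = [1.0] + [0.0] * max_z       # coefficients of the running product
    for l in range(2, terms + 2):
        b = 2.0 / ((l - 1) * (l + 2))
        # Multiply the polynomial by (1 + b*u), keeping degrees <= max_z.
        for z in range(max_z, 0, -1):
            coeffs[z] += b * coeffs[z - 1]
    return [c / 3.0 for c in coeffs]


weights = z_weights()
for z, w in enumerate(weights):
    print(z, round(w, 5))
```

The coefficient of $u^0$ is exactly 1/3, and the coefficient of $u^1$ converges to $\frac{1}{3}\sum_{\ell \ge 2} b_\ell = 11/27$; the truncation error of the printed values is of order $1/\text{terms}$.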

Theorem 3. 1. The probability generating function of $Z$ is

$E[u^Z] = \frac{1}{3} \exp\Big( \sum_{\ell=2}^{\infty} \log \frac{\ell(\ell+1) + 2(u-1)}{(\ell+2)(\ell-1)} \Big)$.

2. The expectation and variance are

$E[Z] = 1$, $\mathrm{Var}[Z] = 14 - \frac{4}{3}\pi^2 \approx 0.84052$.

3. The probability weights are given by 

a:^ iai—z j — 1 

where the ai G No a?^cf 

1 1 
6,- := IH r H r. 

The weights for $z = 0, 1, 2, 3$ are

$P[Z = 0] = \frac{1}{3}$, $P[Z = 1] = \frac{11}{27} \approx 0.40740$,
$P[Z = 2] = \frac{107}{243} - \frac{2\pi^2}{81} \approx 0.19664$, $P[Z = 3] = \frac{1003}{2187} - \frac{10\pi^2}{243} \approx 0.05246$.

5 Relations to population genetics

Consider sequence data, obtained from a sample of individuals in a population that reproduces
according to Wright-Fisher dynamics. Besides resampling we consider neutral mutations for an
infinite sites model (as introduced in [Kim71]) occurring at rate $\theta/2$ along each line. Using common
notation in population genetics, we consider the diffusion limit of the dynamics of the population,
where time has been rescaled by a factor $N$, the number of haploids in the population. The per
generation mutation probability $\mu$ along each line is rescaled to $\theta/2$, where $\theta = 2N\mu$.

Mutations can also be modelled in the look-down picture: for each level there is an independent
Poisson clock with rate $\theta/2$ by which mutations on the line carrying the corresponding level
accumulate. This implies that on each line of the look-down process mutations arise at rate $\theta/2$.
All mutations an individual carries at time $t$ are collected along its line of descent.


Segregating sites 

For two individuals sampled from the population, the expected number of segregating sites is
$\theta E[T_c]$, where $T_c$ is the random time to coalescence of the individuals' ancestral lineages. This
time is unusually short at instances when the MRCA changes. In [Taj90], Tajima studied the
coalescent at such times. He concluded that then the coalescence rate from $k$ to $k-1$ ancestral
lineages is $\binom{k+1}{2}$, his argument being that, in addition to the $k$ lineages, there is one extra line,
which apparently must belong to the family that disappears at the time of the MRCA change.

These coalescence rates can also be seen from the particle representation of the MRCA process.
In fact, the fixation curves give the shape of the coalescent tree of the whole population back from
the time of the MRCA change. Recall that the fixation curve moves from level $k$ to $k+1$ at rate
$\binom{k+1}{2}$. Consequently the time the coalescent back from some time point $E$ stays with $k$ lines is
exponentially distributed with rate $\binom{k+1}{2}$, which means that this is the rate to go down from $k$ to
$k-1$ lineages. As these rates differ from the rates in Kingman's coalescent, the random coalescence
time $T_c$ of the 2-sample cannot be exponential. However, the 2-sample coalescent is embedded in
the full coalescent; the probability that the two sampled lines find a common ancestor at the time
when there are $\ell$ lines left in the full coalescent is (see Remark 3.4 or [Taj90], equation (6))

$$\frac{2}{(\ell+1)(\ell+2)}.$$

As the time of going down from infinity to $\ell$ lines in the coalescent at an MRCA time is distributed
like $S^{\ell+1}$, we obtain the distribution for the coalescence time $T_c$ of the two lines

$$P[T_c \in dt] = \sum_{\ell=1}^{\infty} \frac{2}{(\ell+1)(\ell+2)}\, P[S^{\ell+1} \in dt].$$

Taking expectations, and using $E[S^{\ell+1}] = \sum_{k \ge \ell+1} \binom{k+1}{2}^{-1} = 2/(\ell+1)$, we obtain

$$E[T_c] = \sum_{\ell=1}^{\infty} \frac{4}{(\ell+1)^2(\ell+2)} = 4\sum_{\ell=1}^{\infty}\Big(\frac{1}{(\ell+1)^2} - \frac{1}{(\ell+1)(\ell+2)}\Big) = \frac{2}{3}\pi^2 - 6 \approx 0.58,$$

a result already obtained in [Taj90]. As the coalescence time for two lines in equilibrium is
exponential with mean 1, this result means that the expected number of segregating sites for a
2-sample is reduced by 42% at times when the MRCA changes.
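
The value of this series is easy to confirm numerically. The following is a minimal sketch, assuming only the series $\sum_{\ell \ge 1} 4/((\ell+1)^2(\ell+2))$ and the closed form $\frac{2}{3}\pi^2 - 6$ stated above; the function name and the truncation point are illustrative choices, not part of the original argument.

```python
import math

def expected_pair_coalescence_time(n_terms: int = 200_000) -> float:
    """Partial sum of sum_{l>=1} 4 / ((l+1)^2 (l+2)); the neglected tail is of order 1/n_terms^2."""
    return math.fsum(4.0 / ((l + 1) ** 2 * (l + 2)) for l in range(1, n_terms + 1))

approx = expected_pair_coalescence_time()
closed_form = 2.0 * math.pi ** 2 / 3.0 - 6.0
print(f"series: {approx:.6f}, closed form: {closed_form:.6f}")
```

Both numbers agree to many digits and round to 0.58, which is the 42% reduction relative to the equilibrium mean 1.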

For samples of arbitrary size, the number of segregating sites is Poisson with mean $\theta/2$ times
the total branch length of the sample's genealogical tree. In [RBY04], Figure 2c, a path of the
time evolution of this total branch length is depicted for a spatial and a "well-mixed" population.
At certain instances, one sees a sudden substantial decrease of the path length. One may guess that
this happens primarily at times at which the MRCA changes, since then the coalescent tree is
unusually short.



Substitutions 

Most mutations that occur in a population are quickly lost. However, some eventually fixate, i.e. 
all individuals in the population carry the new mutation. This replacement is termed a substitution 
and the corresponding mutations are called determining mutations. In [Wat82a] and [Wat82b],
Watterson studied several aspects of the process of substitutions. While we are concerned with
the jump from today's MRCA to the next one, Watterson fixes two time points 0 and t and studies
the time between the MRCAs at these times, i.e. $A_t - A_0$, irrespective of the number of MRCAs
that are established between 0 and t. All mutations on the ancestral line between $A_t$ and $A_0$
are then determining mutations and their number gives the number of substitutions between
times 0 and t.

The only way a mutation can become a substitution is through an MRCA change. This is 
because any mutation that occurs in the population belongs to one of the two oldest families. For 
the mutation to become fixed it is necessary that the family not carrying the mutation dies out. 
In other words, it is necessary that the MRCA changes. 

Consider the graphical lookdown representation including mutations falling on lines at all levels 
at rate $\theta/2$. A mutation that occurs is determining if and only if it occurs on the line at level
one. Indeed, we already found that MRCAs of the population as seen in the lookdown picture 
always are at level one. On the other hand, given the time point of a mutation on a line at level 
one, eventually all individuals in the population are descendants of the individual carrying this 
mutation which shows that all mutations that occur at level one are determining. 

Denote by $\mathcal{S} = \{(E,S)\}$ the process of times $E$ and numbers $S$ of substitutions at these
times. As times $E$ of MRCA changes are the only ones that can be substitution times and the
number of mutations on a line is Poisson distributed with rate $\theta/2$, we find that the process $\mathcal{S}$ is a
close relative of the MRCA point process:

Proposition 5.1. Let $\{(E,B)\}$ be distributed as the MRCA point process. Additionally, for
all successive pairs $(E',B')$ and $(E'',B'')$, let $S''$ be Poisson-distributed with intensity parameter
$\frac{\theta}{2}(B''-B')$. Then $\{(E'',S'') : S'' > 0\}$ is a version of $\mathcal{S}$.
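
The construction in Proposition 5.1 can be sketched in a few lines. The code below is only an illustration under stated assumptions: the list `pairs` of successive (E, B) values is hypothetical input (it is not produced by a simulation of the MRCA process here), and `poisson_sample` and `substitution_process` are names introduced for this sketch. The Poisson variate is drawn by counting rate-1 exponential arrivals, since the Python standard library has no Poisson sampler.

```python
import random

def poisson_sample(mean: float, rng: random.Random) -> int:
    """Draw a Poisson(mean) variate by counting rate-1 exponential arrivals in [0, mean)."""
    count, elapsed = 0, rng.expovariate(1.0)
    while elapsed < mean:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count

def substitution_process(pairs, theta, rng):
    """Attach a Poisson(theta/2 * (B'' - B')) mark to each successive pair and keep positive marks.

    pairs: list of (E, B) tuples ordered by E (hypothetical input data).
    Returns the list of (E'', S'') with S'' > 0.
    """
    events = []
    for (_, b_prev), (e_next, b_next) in zip(pairs, pairs[1:]):
        s = poisson_sample(0.5 * theta * (b_next - b_prev), rng)
        if s > 0:
            events.append((e_next, s))
    return events

# Hypothetical example: three successive MRCA changes (E, B) with B < E.
example_pairs = [(1.0, 0.2), (2.5, 1.4), (3.1, 2.6)]
print(substitution_process(example_pairs, theta=1.0, rng=random.Random(0)))
```

MRCA changes whose mark is zero are dropped, which is why the substitution times form a thinning of the times E that depends on the gaps between successive values of B.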
Figure 5: The variable $I^k$ is the level of the coalescence curve back from time $t = 0$ when the next
fixation curve has reached level $k$. The corresponding real times are denoted by $T_k$. The variable
$K^j$ is the level of the fixation curve when the coalescent curve has reached level $j$. In this example,
$I^2 = 4$, $I^3 = 6$, $I^4 = 8$, $K^2 = K^3 = 1$, $K^4 = K^5 = 2$, $K^6 = K^7 = 3$ and $L = 4$.


This confirms the observation in [Wat82b] that (i) substitution times do not form a Poisson
process and (ii) substitutions tend to occur in clusters.

6 Proof of Theorem 1

Recall the definition of $L_t$ and $I_t$ in (3.1) and (3.2), and also recall that we put without loss of
generality $t = 0$, omitting the corresponding sub- and superscripts 0. By definition, the fixation
curve $F^B$ starts at time $B$ at level 2 and exits at time $E$ at level $\infty$; let us now extend this
definition by putting

$$F^B_\tau := 1 \quad \text{if } \tau < B.$$

The following auxiliary variables will be helpful:

$$K^j := \text{the level of } F^B \text{ while } C = j, \qquad j = 2, 3, \ldots$$

and

$$I^k := \text{the level of } C \text{ when } F^B \text{ reaches level } k, \qquad k = 2, 3, \ldots$$

Formally, putting

$$T_k := \inf\{\tau : F^B_\tau = k\}, \qquad k = 2, 3, \ldots,$$

we have

$$I^k := \begin{cases} C_{T_k}, & \text{if } T_k < 0, \\ \infty, & \text{if } T_k > 0. \end{cases}$$

Thus, $T_2 = B$, $I^2 = I$,

$$K^j = \max\{k : I^k \le j\} \quad \text{and} \quad L = K^\infty := \lim_{j\to\infty} K^j. \qquad (6.1)$$

The random variables $I^k$, $K^j$ and $L$ are illustrated in Figure 5.

Lemma 6.1. $K = (K^2, K^3, \ldots)$ is an inhomogeneous Markov chain starting in $K^2 = 1$ and with
transition probability given by

$$P[K^{j+1} = k+1 \mid K^j = k] = \frac{\binom{k+1}{2}}{\binom{j+1}{2}} = 1 - P[K^{j+1} = k \mid K^j = k], \qquad j > k \ge 1. \qquad (6.2)$$

Moreover, $K$ is independent of the coalescence curve $C$.

Proof. When the coalescence curve moves from $j+1$ to $j$ at some time $s$, that is, $C_s = j+1$
and $C_{s-} = j$, then some look-down event involving two levels $\le j+1$ must happen. When at
time $s-$ the fixation curve is at level $k$, the probability that the fixation curve jumps at time $s$
from $k$ to $k+1$ is $\binom{k+1}{2}/\binom{j+1}{2}$,
since all possible look-down events are equally probable and there
are $\binom{k+1}{2}$ events that push the next fixation curve one level up. Observe that this is independent
of the prehistory $K^2, \ldots, K^{j-1}$, independent of the coalescence curve $C$, and in particular also
independent of the time $A_0$. □

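In the next lemma we calculate the joint distribution of the random variables $I^k$.

Lemma 6.2. The joint distribution of $(I^2, I^3, \ldots)$ is given by

$$P[I^2 = \infty] = \frac{1}{3}, \qquad (6.3)$$

$$P[I^2 = i_2, \ldots, I^\ell = i_\ell,\ I^{\ell+1} = \infty] = \frac{(\ell-1)!\,\ell!}{3}\prod_{m=2}^{\ell}\frac{1}{(i_m+m)(i_m+m-1)} \qquad (6.4)$$

for $2 < i_2 < \cdots < i_\ell$.

Proof. The event $\{I^2 = \infty\}$ equals the event that the next fixation curve has not yet started by
time 0, that is the event $\{B > 0\} = \{K^j = 1 \text{ for all } j = 2, 3, \ldots\}$. Thus, using (6.2),

$$P[I^2 = \infty] = \prod_{j=3}^{\infty}\Big(1 - \frac{1}{\binom{j}{2}}\Big) = \prod_{j=3}^{\infty}\frac{(j+1)(j-2)}{j(j-1)} = \frac{1}{3},$$

since the product telescopes. This shows (6.3). To prove (6.4), we express the event on its left
hand side in terms of the variables $K^j$:

$$\{I^2 = i_2, \ldots, I^\ell = i_\ell,\ I^{\ell+1} = \infty\} = \{K^{i_2-1} = 1,\ K^{i_2} = 2,\ \ldots,\ K^{i_\ell-1} = \ell-1,\ K^{i_\ell} = K^{i_\ell+1} = \cdots = \ell\}.$$

Putting $i_1 = 2$ and $i_{\ell+1} = \infty$, and using Lemma 6.1, we arrive at

$$\begin{aligned}
P[I^2 = i_2, \ldots, I^\ell = i_\ell,\ I^{\ell+1} = \infty]
&= \prod_{m=1}^{\ell-1}\Bigg(\frac{\binom{m+1}{2}}{\binom{i_{m+1}}{2}}\prod_{j=i_m+1}^{i_{m+1}-1}\Big(1 - \frac{\binom{m+1}{2}}{\binom{j}{2}}\Big)\Bigg)\cdot\prod_{j=i_\ell+1}^{\infty}\Big(1 - \frac{\binom{\ell+1}{2}}{\binom{j}{2}}\Big) \\
&= \prod_{m=1}^{\ell-1}\Bigg(\frac{(m+1)m}{i_{m+1}(i_{m+1}-1)}\prod_{j=i_m+1}^{i_{m+1}-1}\frac{(j-m-1)(j+m)}{j(j-1)}\Bigg)\cdot\prod_{j=i_\ell+1}^{\infty}\frac{(j-\ell-1)(j+\ell)}{j(j-1)} \\
&= \prod_{m=1}^{\ell-1}\Bigg(\frac{(m+1)m}{i_{m+1}(i_{m+1}-1)}\cdot\frac{(i_m-m)\cdots(i_m-1)}{(i_m+1)\cdots(i_m+m)}\cdot\frac{i_{m+1}\cdots(i_{m+1}+m-1)}{(i_{m+1}-m-1)\cdots(i_{m+1}-2)}\Bigg)\cdot\frac{(i_\ell-\ell)\cdots(i_\ell-1)}{(i_\ell+1)\cdots(i_\ell+\ell)} \\
&= \frac{1}{3}\prod_{m=2}^{\ell}\frac{m(m-1)}{(i_m+m)(i_m+m-1)} = \frac{(\ell-1)!\,\ell!}{3}\prod_{m=2}^{\ell}\frac{1}{(i_m+m)(i_m+m-1)}.
\end{aligned}$$

□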
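From the joint distribution of $(I^2, I^3, \ldots)$ given in Lemma 6.2 we obtain because of (6.1) the
joint distribution of $(L, I)$ by projection:

Proof of Proposition 3.1

Because of $\{L = 1\} = \{I^2 = \infty\}$, we obtain the assertion of (3.3) for $\ell = 1$ from (6.3). For
$\ell = 2, 3, \ldots$ we proceed by induction. For $\ell = 2$ we have, using again Lemma 6.2,

$$P[I = i, L = 2] = P[I^2 = i, I^3 = \infty] = \frac{2}{3}\cdot\frac{1}{(i+2)(i+1)}.$$

If the assertion is true for all $2, \ldots, \ell$, we have

$$\begin{aligned}
P[I = i, L = \ell+1]
&= \frac{1}{(i+2)(i+1)}\sum_{i<i_3<\cdots<i_{\ell+1}}\frac{(\ell+1)!\,\ell!}{3}\prod_{m=3}^{\ell+1}\frac{1}{(i_m+m)(i_m+m-1)} \\
&= \frac{(\ell+1)!\,\ell!}{3(i+2)(i+1)}\sum_{j>i}\frac{1}{(j+3)(j+2)}\sum_{j<i_4<\cdots<i_{\ell+1}}\ \prod_{m=4}^{\ell+1}\frac{1}{(i_m+m)(i_m+m-1)} \\
&= \frac{(\ell+1)!\,\ell!}{3\,(\ell-2)!\,(i+2)(i+1)}\sum_{j>i}\frac{1}{(j+2)\cdots(j+\ell+1)} \\
&= \frac{(\ell+1)!\,\ell!}{3\,(\ell-1)!\,(i+2)(i+1)}\sum_{j>i}\Big(\frac{1}{(j+2)\cdots(j+\ell)} - \frac{1}{(j+3)\cdots(j+\ell+1)}\Big) \\
&= \frac{\ell\,(\ell+1)!}{3(i+2)(i+1)}\cdot\frac{1}{(i+3)\cdots(i+\ell+1)}.
\end{aligned}$$

Here the inner sum over $i_4 < \cdots < i_{\ell+1}$ equals $1/\big((\ell-2)!\,(j+4)\cdots(j+\ell+1)\big)$ by the induction
hypothesis, and the last sum over $j$ telescopes,

and we are done. □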
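
As a plausibility check of the projection, the joint weights can be summed over $i$ numerically. The sketch below assumes that the joint law has the form $P[I = i, L = \ell] = (\ell-1)\,\ell!/\big(3(i+1)(i+2)\cdots(i+\ell)\big)$ for $\ell \ge 2$ and $i \ge 3$, which is what the induction above yields, and compares the resulting marginal with $2/((\ell+1)(\ell+2))$; the function names and the truncation point are illustrative.

```python
import math

def joint_weight(i: int, l: int) -> float:
    """Assumed joint weight (l-1) * l! / (3 (i+1)(i+2)...(i+l)) for l >= 2, i >= 3."""
    denom = 1.0
    for m in range(1, l + 1):
        denom *= i + m
    return (l - 1) * math.factorial(l) / (3.0 * denom)

def marginal_of_L(l: int, i_max: int = 50_000) -> float:
    """Sum the joint weights over i = 3, ..., i_max (the neglected tail is of order 1/i_max)."""
    return math.fsum(joint_weight(i, l) for i in range(3, i_max + 1))

for l in range(2, 6):
    print(l, marginal_of_L(l), 2.0 / ((l + 1) * (l + 2)))
```

The two printed columns agree up to the truncation error, consistent with the distribution of $L$ used elsewhere in the paper.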
We turn now to the 

Completion of the Proof of Theorem 1 

Given $\{L = 1\} = \{B > 0\}$, and independently of $C$ (and therefore also of $A_0$), the time $B$ it takes
to enter the next fixation curve is standard exponentially distributed (and therefore distributed
like $S_1 \sim \exp(1)$), and the additional time it takes this fixation curve to exit is distributed like
$S^2$, and is independent of $B$.

Given $L = \ell \ge 2$, $I = i < \infty$ and $A_0 = -d$, the time at which the coalescent curve jumps from
level $i+1$ to $i$ is distributed like $-R_{i,d}$. By construction, this is also the time $B$ at which the next
fixation curve $F^B$ enters. At time 0, this fixation curve is at level $\ell$; independently of the past, the
time it takes until this fixation curve exits is distributed like $S^\ell$. □

7 Proof of Theorem 2

First we give a formal description of the dynamics of the process A. Afterwards we derive its 
equilibrium distribution, and finally we show that this equilibrium distribution also prevails at the 
distinguished times E. 

The dynamics of A 

Assume $Z_t = k$ with $L^1_t = \ell_1, \ldots, L^k_t = \ell_k > L^{k+1}_t = 1$, i.e. at time t there are exactly k particles
at levels > 1; in other words, exactly k fixation curves are present at time t. Assume level $j$ looks
down to level $i$ for $i < j$. If $j > \ell_1 + 1$, only lines at level greater than $\ell_1 + 1$ are pushed. In this case
no particle moves, i.e. $L_t$ stays constant. When $j \le \ell_1 + 1$, at least the level of the next fixation
curve increases by one from $\ell_1$ to $\ell_1 + 1$ and the corresponding particle moves. The rate of these
events is $\binom{\ell_1+1}{2}$, which equals the rate at which a fixation curve moves from $\ell_1$ to $\ell_1 + 1$. When $j$
is at most $\ell_2 + 1$, also the position of the second fixation curve is increased and the corresponding
particle moves.

As look-down events among the first $\ell_1 + 1$ levels occur at rate $\binom{\ell_1+1}{2}$, this is also the rate
at which the first particle moves. To be exact, with rate $\binom{\ell_1+1}{2} - \binom{\ell_2+1}{2}$ only the first particle is
affected, with rate $\binom{\ell_2+1}{2} - \binom{\ell_3+1}{2}$ the first two particles move, and so on. Additionally, at rate 1, a
look-down event from level 2 to 1 occurs which has the effect that a new particle enters at level 2,
i.e. $L^{k+1}$ moves from level 1 to level 2, and all particles at levels greater than 1 move as well.

The first particle moves at a quadratic rate and thus reaches infinity within finite time. When
it hits infinity at time $E$ the fixation curve is completed and $L^k_E = L^{k+1}_{E-}$ for $k \ge 1$ as stated in
(4.2), because at time $E$ the second particle becomes the leading one.

The just stated arguments prove the following proposition describing the dynamics of the 
process A. 

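Proposition 7.1. From $A_t = (\ell_1, \ell_2, \ldots)$ transitions occur

$$\begin{cases}
\text{to } (\ell_1+1, \ldots, \ell_k+1, \ell_{k+1}, \ldots) & \text{at rate } \binom{\ell_k+1}{2} - \binom{\ell_{k+1}+1}{2} \text{ if } \ell_k > 1, \\
\text{to } (\ell_1+1, \ldots, \ell_k+1, \ell_{k+1}, \ldots) & \text{at rate } 1 \text{ if } \ell_{k-1} > \ell_k = 1.
\end{cases}$$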
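
The dynamics described above can be explored with a small Gillespie-type simulation. This is a rough sketch under explicit assumptions, not the construction used in the proofs: infinity is truncated at a finite level `max_level` (a particle reaching it is treated as having exited), a look-down event is drawn as a uniformly chosen pair of levels among the first $\ell_1 + 1$, and a particle at level $\ell$ is pushed whenever the upper level $j$ of the pair satisfies $j \le \ell + 1$. The function name and parameters are introduced for illustration only.

```python
import random

def simulate_particle_counts(total_time, max_level=100, seed=1):
    """Return the fraction of time spent with Z = z particles above level 1.

    Assumption: levels are truncated, so a particle reaching `max_level` exits.
    """
    rng = random.Random(seed)
    levels = []                 # particle levels above 1, in decreasing order
    t = 0.0
    time_in_z = {}              # z -> total time spent with Z = z
    while t < total_time:
        top = levels[0] if levels else 1
        rate = (top + 1) * top / 2          # look-down events among levels 1..top+1
        dt = min(rng.expovariate(rate), total_time - t)
        time_in_z[len(levels)] = time_in_z.get(len(levels), 0.0) + dt
        t += dt
        if t >= total_time:
            break
        # uniformly chosen pair i < j among the first top+1 levels; only j matters
        j = max(rng.sample(range(1, top + 2), 2))
        levels = [l + 1 if l >= j - 1 else l for l in levels]
        if j == 2:
            levels.append(2)                # the waiting particle enters at level 2
        if levels and levels[0] >= max_level:
            levels.pop(0)                   # leading particle exits (truncated infinity)
    return {z: s / total_time for z, s in sorted(time_in_z.items())}

freq = simulate_particle_counts(2000.0)
print(freq, sum(z * f for z, f in freq.items()))
```

With a long enough run, the time fraction with no particle above level 1 should be close to 1/3 and the time-averaged number of particles close to 1, up to Monte Carlo noise and a small bias of order `2 / max_level` from the truncation.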

To derive the equilibrium distribution of the particle system A, it will be helpful to compute
the one-time distributions and the limiting distribution of the Markov chain $K$ from Lemma 6.1.

Lemma 7.2.

$$P[K^j = k] = \frac{j+1}{j-1}\cdot\frac{2}{(k+1)(k+2)}, \qquad j > k \ge 1. \qquad (7.1)$$

Proof. To prove (7.1), we will proceed by induction. Because of $P[K^2 = 1] = 1$, the formula is
true for $j = 2$. From (6.2) we obtain the induction step:

$$\begin{aligned}
P[K^{j+1} = k] &= P[K^j = k]\Big(1 - \frac{\binom{k+1}{2}}{\binom{j+1}{2}}\Big) + P[K^j = k-1]\,\frac{\binom{k}{2}}{\binom{j+1}{2}} \\
&= \frac{j+1}{j-1}\cdot\frac{2}{(k+1)(k+2)}\cdot\frac{(j+1)j - (k+1)k}{(j+1)j} + \frac{j+1}{j-1}\cdot\frac{2}{k(k+1)}\cdot\frac{k(k-1)}{(j+1)j} \\
&= \frac{2}{(j-1)j(k+1)(k+2)}\Big((j+1)j - (k+1)k + (k-1)(k+2)\Big) \\
&= \frac{j+2}{j}\cdot\frac{2}{(k+1)(k+2)}.
\end{aligned}$$

□

We are now ready for the



Completion of the Proof of Theorem 2 

We will briefly write $(L^1, L^2, \ldots) := (L^1_0, L^2_0, \ldots)$. Observe that $L^1$ equals the level $L$ which the
fixation line entering at time $B$ has reached at time 0. From (6.1) and Lemma 7.2 we thus infer
readily that

$$P[L^1 = \ell_1] = \frac{2}{(\ell_1+1)(\ell_1+2)}, \qquad (7.2)$$

which proves (4.4) and also re-establishes (3.4).
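
The one-time distributions of $K$ and their limit can be checked with exact rational arithmetic. The following sketch iterates the transition rule (6.2) starting from $K^2 = 1$ and compares the result with the closed form (7.1); the function names are chosen for this illustration.

```python
from fractions import Fraction

def one_time_distributions(j_max):
    """Iterate the transition rule (6.2) exactly, starting from K^2 = 1."""
    dist = {1: Fraction(1)}                  # law of K^2
    laws = {2: dict(dist)}
    for j in range(2, j_max):
        nxt = {}
        for k, p in dist.items():
            up = Fraction((k + 1) * k, (j + 1) * j)   # binom(k+1,2) / binom(j+1,2)
            nxt[k + 1] = nxt.get(k + 1, 0) + p * up
            nxt[k] = nxt.get(k, 0) + p * (1 - up)
        dist = nxt
        laws[j + 1] = dict(dist)
    return laws

def closed_form(j, k):
    """Right-hand side of (7.1)."""
    return Fraction(j + 1, j - 1) * Fraction(2, (k + 1) * (k + 2))

laws = one_time_distributions(12)
print(laws[12][1], closed_form(12, 1))
```

As $j$ grows, the factor $(j+1)/(j-1)$ tends to 1, so the weights approach $2/((k+1)(k+2))$, the distribution in (7.2).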

Next we compute the conditional distribution of $L^{k+1}$, given $L^k = \ell_k, \ldots, L^1 = \ell_1$, where
$2 \le \ell_k < \cdots < \ell_1$. Consider the k-th particle, i.e. the particle which has level $\ell_k$ at time 0,
and denote the time at which this particle entered at level 2 by $B_k$. Since the trajectory of this
particle between times $B_k$ and 0 is an initial piece of the coalescent curve (belonging to the exit
time of this particle), and since $L^{k+1}$ is the level of the next fixation curve while this coalescent
curve has level $\ell_k$, we can apply Lemmata 6.1 and 7.2 to the trajectory of the (k+1)-st particle,
parametrised by the levels of the k-th particle's trajectory, to conclude that

$$P[L^{k+1} = \ell_{k+1} \mid L^k = \ell_k, \ldots, L^1 = \ell_1] = \frac{\ell_k+1}{\ell_k-1}\cdot\frac{2}{(\ell_{k+1}+1)(\ell_{k+1}+2)}, \qquad 1 \le \ell_{k+1} < \ell_k. \qquad (7.3)$$

Iterating this we obtain

$$\begin{aligned}
P[L^1 = \ell_1, \ldots, L^k = \ell_k, L^{k+1} = 1]
&= \frac{2}{(\ell_1+1)(\ell_1+2)}\cdot\frac{\ell_1+1}{\ell_1-1}\cdot\frac{2}{(\ell_2+1)(\ell_2+2)}\cdot\frac{\ell_2+1}{\ell_2-1}\cdots\frac{2}{(\ell_k+1)(\ell_k+2)}\cdot\frac{\ell_k+1}{\ell_k-1}\cdot\frac{1}{3} \\
&= \frac{1}{3}\prod_{j=1}^{k}\frac{2}{(\ell_j+2)(\ell_j-1)}.
\end{aligned}$$

This shows (4.3).

In Section we argued, by disentangling the combinatorics from the time embedding, that 
$L = L^1$ is independent of the coalescence curve $C = C^0$. The same argument shows that $A_t$ is
independent of $(C^t, \eta_t)$, that is, both the coalescent curve back from time t and the exit time
points before t.

We claim that this assertion remains true conditioned on $\{t \in \eta\}$, i.e. the event that t is an
exit time. Indeed, given $\{t \in \eta\}$ we know that t is the exit point of a fixation curve, which hence
must coincide with the coalescence curve $C^t$. So the above argument shows that also under this
additional conditioning the particle configuration $A_t$ is independent of $C^t$ and $\eta_t$.

Now we turn to assertion 2. of the theorem. Consider a population in equilibrium. We know
already that $A_t$ is in equilibrium, i.e. has distribution $\pi_A$, independently of the exit times of
particles before t and no matter if t is conditioned to be an exit time or not. This proves that $\eta$ is
a stationary renewal process. Additionally we know that waiting times between points have the
same distribution as the waiting time out of equilibrium. Thus the waiting times are memoryless,
hence exponential, and $\eta$ is Poisson. □

Remark 7.3. Here is a more heuristic way (in the spirit of Remark 3.2, 2) to see the identity (7.3).

Equation (3.5) says that $L = L^1$ has the distribution of the initial run length $R$ in a coin
tossing with random success probability that is uniformly distributed on [0, 1]. Similarly, equation (7.3)
says that

given $L^1 = \ell_1$, the random variable $L^2$ is distributed like $R$ conditioned to $\{R < \ell_1\}$. (7.5)

This is readily seen because

$$P[R < \ell] = 1 - 2\int_0^1 p^{\ell}\,dp = \frac{\ell-1}{\ell+1},$$

and consequently

$$P[R = \ell_2 \mid R < \ell_1] = \frac{\ell_1+1}{\ell_1-1}\cdot\frac{2}{(\ell_2+1)(\ell_2+2)}.$$

The property (7.5) can also be understood as follows: $L^1 = \ell_1$ is the number of currently living
individuals that still have offspring at the time $E_0$ of the next MRCA change and $L^2$ is the number
of individuals still having offspring at the time $E_{E_0}$ of the next but one MRCA change. At time
$E_0$ one of the two families which were the oldest at time 0 dies out, and our condition is that the
$\ell_1$ individuals at time 0 have offspring in the surviving family. This surviving family will again be
made up of two oldest subfamilies, whose sizes again constitute a uniform split of [0, 1]. At the
time of the next but one MRCA change after time 0, at least some of the $\ell_1$ lines must have gone
extinct, which amounts to the condition that not all of them belong to the same subfamily. The
number of lines that belong to the surviving subfamily thus has the same distribution as $R$
conditioned to $\{R < \ell_1\}$.
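
The run-length identity can be verified exactly for small values. The sketch below assumes $P[R = \ell] = 2\int_0^1 p^{\ell}(1-p)\,dp = 2/((\ell+1)(\ell+2))$, the weight of an initial run of length $\ell$ (of either outcome) under a uniformly distributed success probability, and normalises directly; the function names are illustrative.

```python
from fractions import Fraction

def run_length_weight(l):
    """P[R = l] = 2 * integral_0^1 p^l (1 - p) dp = 2 / ((l + 1)(l + 2))."""
    return Fraction(2, (l + 1) * (l + 2))

def conditional_weight(l2, l1):
    """P[R = l2 | R < l1], computed from the weights by direct normalisation."""
    below = sum(run_length_weight(l) for l in range(1, l1))
    return run_length_weight(l2) / below

# Compare with the closed form (l1 + 1) / (l1 - 1) * 2 / ((l2 + 1)(l2 + 2)).
print(conditional_weight(2, 5), Fraction(6, 4) * Fraction(2, 3 * 4))
```

The normalising constant is $P[R < \ell_1] = (\ell_1-1)/(\ell_1+1)$, which is where the prefactor $(\ell_1+1)/(\ell_1-1)$ comes from.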



8 Proof of Theorem 3

By the definition of Z we immediately see from Theorem 2 that in equilibrium

$$P[Z = z] = \sum_{1=\ell_{z+1}<\ell_z<\cdots<\ell_1} P[L^1 = \ell_1, \ldots, L^{z+1} = \ell_{z+1}] = \frac{1}{3}\sum_{1<\ell_z<\cdots<\ell_1}\ \prod_{j=1}^{z}\frac{2}{(\ell_j+2)(\ell_j-1)}. \qquad (8.1)$$

This is the basis for the proof of Theorem 3. We first show that the correct weights of the
distribution of Z are given by 3. The weights from 4. are just an application of this. From the
weights we compute the probability generating function given in 1. By calculating derivatives we
obtain the expectation and the variance as given in 2.



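Proof of 3. and 4.

All we have to do is to simplify (8.1) for more efficient computation. Therefore we define

$$f(\ell) := \frac{2}{(\ell+2)(\ell-1)}, \qquad x_k := \sum_{\ell=2}^{\infty}\big(f(\ell)\big)^k \qquad (8.2)$$

(the definition of $x_k$ matches the definition in (4.5) as we will show below) and

$$p_0 := 1, \qquad p_z := \sum_{1<\ell_z<\cdots<\ell_1}\ \prod_{m=1}^{z} f(\ell_m) = \frac{1}{z!}\sum_{1<\ell_1,\ldots,\ell_z\ \text{pwd}}\ \prod_{m=1}^{z} f(\ell_m).$$

Here pwd means pairwise different. With this definition, for $z \ge 0$,

$$P[Z = z] = \tfrac{1}{3}\,p_z.$$

We will show first

$$p_z = \frac{1}{z}\Big(\sum_{j=1}^{z}(-1)^{j-1}p_{z-j}\,x_j\Big), \qquad (8.3)$$

with $x_j$ given by (4.5), which gives $p_z$ recursively. Then we calculate $p_z$ as

$$p_z = \sum_{a:\ \sum_i i a_i = z}\ \prod_{i=1}^{z}\frac{1}{a_i!}\Big(\frac{(-1)^{i-1}x_i}{i}\Big)^{a_i}, \qquad (8.4)$$

which gives part 3. of Theorem 3.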
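
The recursion (8.3) lends itself to direct computation. The sketch below is an illustration under stated assumptions: the power sums $x_k = \sum_{\ell \ge 2} \big(2/((\ell+2)(\ell-1))\big)^k$ are approximated by truncated sums rather than by the closed form (4.5), and the function names and truncation point are choices made here.

```python
import math

def power_sum(k, n_terms=200_000):
    """x_k = sum_{l>=2} (2 / ((l+2)(l-1)))**k, truncated after n_terms terms."""
    return math.fsum((2.0 / ((l + 2) * (l - 1))) ** k for l in range(2, n_terms + 2))

def weights(z_max):
    """P[Z = z] = p_z / 3, with p_z from p_z = (1/z) * sum_j (-1)^(j-1) p_{z-j} x_j."""
    x = [0.0] + [power_sum(k) for k in range(1, z_max + 1)]
    p = [1.0]
    for z in range(1, z_max + 1):
        p.append(sum((-1) ** (j - 1) * p[z - j] * x[j] for j in range(1, z + 1)) / z)
    return [pz / 3.0 for pz in p]

print(weights(3))
```

Up to the truncation error (of order $10^{-5}$ for the default `n_terms`), the output reproduces the weights for z = 0, 1, 2, 3 listed in Theorem 3.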
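For (8.3) define

$$b_{z,k} := \sum_{1<\ell_1,\ldots,\ell_z\ \text{pwd}} f(\ell_1)\cdots f(\ell_{z-1})\,f(\ell_z)^k.$$

Then $p_z = b_{z,1}/z!$, $x_k = b_{1,k}$, and consequently

$$b_{z+1,k} = \sum_{1<\ell_1,\ldots,\ell_z\ \text{pwd}} f(\ell_1)\cdots f(\ell_z)\Big(\Big(\sum_{j=2}^{\infty} f(j)^k\Big) - f(\ell_1)^k - \cdots - f(\ell_z)^k\Big) = b_{z,1}\,x_k - z\,b_{z,k+1}.$$

Therefore we can write

$$\sum_{j=1}^{z-1}(-1)^{j-1}p_{z-j}\,x_j = \sum_{j=1}^{z-1}(-1)^{j-1}\,\frac{b_{z-j+1,j} + (z-j)\,b_{z-j,j+1}}{(z-j)!} = \frac{b_{z,1}}{(z-1)!} + (-1)^z\,b_{1,z} = z\,p_z + (-1)^z x_z.$$

Here the second equality follows because the sum telescopes. This gives (8.3).
The second equation, (8.4), is proved by induction. Instead of (8.4) we prove

$$p_z = \sum_{k=1}^{z}\frac{(-1)^{z-k}}{k!}\sum_{j:\ j_1+\cdots+j_k=z}\ \prod_{i=1}^{k}\frac{x_{j_i}}{j_i}, \qquad (8.5)$$

which then gives (8.4) as the sum is over all vectors $j$ of length $k$ which sum up to $z$. Every such
vector can be translated into a configuration $a$ with $\sum_i i a_i = z$, where $a_i$ is the number of $i$'s in
$j$. As for a given length $k$ of the vector $j$ there are $\frac{k!}{a_1!\cdots a_z!}$ of these vectors leading to the same
configuration, (8.4) is the same as (8.5).

For $z = 1$, (8.5) gives $p_1 = x_1$, which is true by definition of $p_z$ and $x_z$. Assume the formula is
correct for $1, \ldots, z$ and use (8.3) to conclude that

$$\begin{aligned}
(z+1)\,p_{z+1} &= \sum_{j=1}^{z}(-1)^{j-1}x_j\sum_{k=1}^{z+1-j}\frac{(-1)^{z+1-j-k}}{k!}\sum_{j_1+\cdots+j_k=z+1-j}\ \prod_{i=1}^{k}\frac{x_{j_i}}{j_i} + (-1)^z x_{z+1} \\
&= \sum_{k=1}^{z}\frac{(-1)^{z-k}}{k!}\sum_{j=1}^{z+1-k}\ \sum_{j_1+\cdots+j_k=z+1-j} x_j\prod_{i=1}^{k}\frac{x_{j_i}}{j_i} + (-1)^z x_{z+1}.
\end{aligned}$$

Since for every $1 \le m \le k+1$

$$\sum_{j=1}^{z+1-k}\ \sum_{j_1+\cdots+j_k=z+1-j} x_j\prod_{i=1}^{k}\frac{x_{j_i}}{j_i} = \sum_{j_1+\cdots+j_{k+1}=z+1} j_m\prod_{i=1}^{k+1}\frac{x_{j_i}}{j_i},$$

then

$$\sum_{j=1}^{z+1-k}\ \sum_{j_1+\cdots+j_k=z+1-j} x_j\prod_{i=1}^{k}\frac{x_{j_i}}{j_i} = \frac{1}{k+1}\sum_{j_1+\cdots+j_{k+1}=z+1}\ \prod_{i=1}^{k+1}\frac{x_{j_i}}{j_i}\ \sum_{m=1}^{k+1} j_m = \frac{z+1}{k+1}\sum_{j_1+\cdots+j_{k+1}=z+1}\ \prod_{i=1}^{k+1}\frac{x_{j_i}}{j_i},$$

and therefore

$$\begin{aligned}
p_{z+1} &= \sum_{k=2}^{z+1}\frac{(-1)^{z+1-k}}{k!}\sum_{j:\ j_1+\cdots+j_k=z+1}\ \prod_{i=1}^{k}\frac{x_{j_i}}{j_i} + (-1)^z\,\frac{x_{z+1}}{z+1} \\
&= \sum_{k=1}^{z+1}\frac{(-1)^{z+1-k}}{k!}\sum_{j:\ j_1+\cdots+j_k=z+1}\ \prod_{i=1}^{k}\frac{x_{j_i}}{j_i},
\end{aligned}$$

which completes the induction and hence proves (8.5).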

To show that the definition of $x_k$ from (8.2) coincides with (4.5) we define



which gives 



Thus for $k, z \ge 1$, $k + z \ge 3$ (otherwise the right side is not defined)



{t + 2f~^{t-lY-^\l-l (. + 2 

Assume we do not sum to $\infty$ but to a large finite $N$ such that $A_{1,0}$ and $A_{0,1}$ exist. It can be
proved by induction on $k + z$ that

where (^^) = 1. Using this we have 

Ak,k = (-!)'( E C'k- ^ ^) + ^^^^ " °dd})^o,,) + oQ). 
j=i \ / 

So, as

$$A_{0,j} = 3^{j-1}\zeta(j), \qquad A_{j,0} = 3^{j-1}\big(\zeta(j) - b_j\big), \qquad b_j = 1 + \frac{1}{2^j} + \frac{1}{3^j},$$

we can write, now also for $N \to \infty$

which shows that $x_k$ is of the form (4.5). This completes the proof of the theorem's assertion 3,
from which the weights claimed in assertion 4 follow by inspection.

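Proof of 1.

To obtain the probability generating function we now calculate

$$\begin{aligned}
g(t) := E[t^Z] &= \sum_{z=0}^{\infty} t^z P[Z = z] = \frac{1}{3}\sum_{z=0}^{\infty}\sum_{k=0}^{z}\frac{(-1)^{z-k}}{k!}\sum_{j:\ j_1+\cdots+j_k=z}\ \prod_{i=1}^{k}\frac{x_{j_i}t^{j_i}}{j_i} \\
&= \frac{1}{3}\sum_{k=0}^{\infty}\frac{1}{k!}\Big(\sum_{j=1}^{\infty}(-1)^{j-1}\frac{x_j t^j}{j}\Big)^k = \frac{1}{3}\exp\Big(\sum_{j=1}^{\infty}(-1)^{j-1}\frac{x_j t^j}{j}\Big),
\end{aligned}$$

where we have used (8.5). The sum in the exponential simplifies to

$$\sum_{j=1}^{\infty}(-1)^{j-1}\frac{t^j}{j}\sum_{i=2}^{\infty}\Big(\frac{2}{(i+2)(i-1)}\Big)^j = \sum_{i=2}^{\infty}\log\Big(1 + \frac{2t}{(i+2)(i-1)}\Big) = \sum_{i=2}^{\infty}\log\frac{i(i+1) + 2(t-1)}{(i+2)(i-1)},$$

which proves the formula for the probability generating function.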
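Proof of 2.

We calculate the first two derivatives of the generating function:

$$g'(t) = g(t)\sum_{i=2}^{\infty}\frac{2}{i(i+1)+2(t-1)}, \qquad g''(t) = g(t)\Big(\Big(\sum_{i=2}^{\infty}\frac{2}{i(i+1)+2(t-1)}\Big)^2 - \sum_{i=2}^{\infty}\frac{4}{\big(i(i+1)+2(t-1)\big)^2}\Big).$$

So

$$E[Z] = g'(1) = 1,$$

$$\mathrm{Var}[Z] = E[Z^2] - 1 = E[Z(Z-1)] = g''(1) = 1 - 4\sum_{i=2}^{\infty}\frac{1}{i^2(i+1)^2},$$

and the last assertion follows by

$$\sum_{i=2}^{\infty}\frac{1}{i^2(i+1)^2} = \sum_{i=2}^{\infty}\Big(\frac{1}{i^2} + \frac{1}{(i+1)^2} - \frac{2}{i(i+1)}\Big) = \frac{\pi^2}{3} - \frac{13}{4}.$$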
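
The closed form of the variance can be confirmed numerically. This is a minimal sketch assuming only the series representation $\mathrm{Var}[Z] = 1 - 4\sum_{i \ge 2} 1/(i^2(i+1)^2)$ from the proof; the function name and truncation point are illustrative.

```python
import math

def variance_of_z(n_terms=100_000):
    """Var[Z] = 1 - 4 * sum_{i>=2} 1 / (i^2 (i+1)^2), truncated after n_terms terms."""
    series = math.fsum(1.0 / (i * i * (i + 1) ** 2) for i in range(2, n_terms + 2))
    return 1.0 - 4.0 * series

print(variance_of_z(), 14.0 - 4.0 * math.pi ** 2 / 3.0)
```

Both values agree to many digits with $14 - \frac{4}{3}\pi^2$, the variance stated in Theorem 3.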